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Hack-proofing options: Best route to embedded CPU encryption depends on
the app.

Cataldo, Anthony

Electronic Engineering Times, 19
2003
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... said Richard Chesson, ST's director of marketing for multimedia
platforms. Similarly, ARM has new operating modes for its forthcoming ARM11
processor that act as parallel domains, but with a different privilege
level.

To discourage snooping, MIPS has added the ability to "swizzle"
information traveling between the cache and the core, making it
impossible to decipher data by probing the cache line. There's also a way
to randomly inject stalls into the core, scrambling the power signatures,
said MIPS CTO Mike Uhler. This requires the...
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Patent watch. 
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... architecture for providing explicit multithreading. 

6,463,529: Compaq-Processor based system with system wide reset and 
partial system reset capabilities. 

6,463,580: Intel-Parallel processing utilizing highly correlated
data values. 

Issued: October 1, 2002 

6,460,115: IBM-System and method for prefetching data to multiple 
levels of cache including selectively using a software hint to override 
a hardware prefetch mechanism. 

6,460,116: AMD-Using separate caches for variable and generated 
fixed-length instructions. 

6,460,129: Fujitsu-Pipeline operation method and pipeline operation
device to interlock the translation of instructions based...
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Intel Turns Up The Heat: Pentium III shifts focus from price to
performance. (Intel's Pentium III microprocessor) (Product Information)

Ristelhueber, Robert
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... improve functions such as 3D graphics, where it has lagged AMD's
3DNow! technology. It also improved both the floating point and the way the
cache works internally.

For its part, AMD has come up with TriLevel Cache to boost the
performance of its new family. It consists of a 256KB L2 write-back
cache operating at the full speed of the processor, complementing the
standard 64KB L1 cache. It added a multiport internal cache design,
enabling simultaneous 64-bit reads and writes to both the L1 and L2
caches. Finally, AMD added a 100MHz frontside bus to the external cache,
which can serve as a scalable Level 3 cache on the motherboard.

The K6-III is positioned against the Pentium III, not the Pentium II,
said Michael Steele, division marketing manager for the Computation...
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Chips: AMD Introduces Industry-leading AMD-K6-III Processor With 3DNow!
Technology. (Product Announcement)
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March 1, 1999 

DOCUMENT TYPE: Product Announcement LANGUAGE: English 

RECORD TYPE: Fulltext 
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TEXT: 

...it is exceptionally fast. The backside 256KB L2 cache of the
AMD-K6-III processor operates at full processor speed. For example, the
internal L2 cache of an AMD-K6-III/450 processor operates at a full 450
MHz. The TriLevel Cache design also offers an internal multiport cache
design. This flexible design feature delivers higher system performance by
enabling simultaneous 64-bit reads and writes of both the L1 cache
and the L2 cache. In addition, each cache can be accessed
simultaneously by the processor core. The AMD-K6-III processor with 3DNow!
technology incorporates AMD's TriLevel Cache design to enable
leading-edge performance for today's consumer PC enthusiasts and business
power users. The 21.3-million transistor AMD-K6-III processor...
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...cache, more than any other x86 CPU, and up to 1,344Kb in total. No
other x86-compatibles have the option of an external L3 cache on the
motherboard. In comparison, the Pentium III has 32Kb of internal cache
and up to 512Kb of half-speed external L2 cache, a total of 512Kb. AMD
says its Level 2 cache also runs at full processor clock speeds, and
that the design incorporates internal multiport caching, so that
simultaneous 64-bit reads and writes of both the L1 and L2 caches
are enabled. Compaq said it planned to use the new chips in a faster
version of its Presario internet PCs. The 0.25 micron, 21...
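
The quoted K6-III total can be checked with simple arithmetic. The sketch below assumes the 64KB L1 and 256KB L2 sizes given in the other K6-III excerpts and reads the quoted "1,344Kb" as kilobytes; the maximum external L3 size is not stated in the excerpt and is only derived here from that total.

```python
# Cache sizes in KB. L1 and L2 are the K6-III figures quoted in these excerpts.
# The L3 maximum is NOT stated in the excerpt; it is derived from the quoted total.
K6_III_L1_KB = 64
K6_III_L2_KB = 256
K6_III_TOTAL_MAX_KB = 1344


def implied_l3_max_kb(l1_kb: int, l2_kb: int, total_kb: int) -> int:
    """Return the external L3 size implied by a quoted maximum cache total."""
    return total_kb - l1_kb - l2_kb


if __name__ == "__main__":
    l3_kb = implied_l3_max_kb(K6_III_L1_KB, K6_III_L2_KB, K6_III_TOTAL_MAX_KB)
    print(f"Implied maximum external L3: {l3_kb} KB")
```

Under these assumptions the remainder is 1,024KB, which would correspond to a 1MB motherboard cache.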
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K7 Challenges Intel. (AMD K7 processor) (Product Development) 
Microprocessor Report, 12, 14, NA 
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... the queue, the store's data is forwarded to the load as soon as it
becomes available, thus allowing the load to complete without a cache
access.
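
The store-to-load forwarding described above can be sketched as a search of the pending stores. This is a minimal illustration, not the K7's actual logic: the `PendingStore` type and `forward_from_store_queue` function are hypothetical names, and the sketch assumes exact address matches between same-sized accesses.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PendingStore:
    address: int
    data: Optional[int]  # None until the store's data becomes available


def forward_from_store_queue(
    older_stores: List[PendingStore], load_address: int
) -> Tuple[str, Optional[int]]:
    """Decide how a load is satisfied, given the stores ahead of it in the queue.

    older_stores is ordered oldest first, so the last match is the youngest one.
    Returns ("forwarded", data), ("wait", None) or ("cache", None).
    """
    for store in reversed(older_stores):
        if store.address == load_address:
            if store.data is not None:
                return ("forwarded", store.data)  # no cache access needed
            return ("wait", None)  # match found, but data is not ready yet
    return ("cache", None)  # no matching store: read the data cache
```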

The cache is nonblocking, so when an L1 miss goes to the L2 or the
bus for resolution, other loads behind it in the queue can access the
cache. The data cache has three complete sets of tags, allowing
simultaneous tag lookup by two requests from the queue and a snoop
from the bus.

The data cache is physically tagged. Effective addresses are
translated to physical addresses in parallel with D-cache tag lookup by a
two-level translation lookaside buffer (TLB). The first level has
fully associative entries and is backed by a 256-entry, four-way
set-associative second level. The memory mapper supports both the 4K
and 4M page sizes of Intel's 36-bit physical-address-space extension and
the newer extended server...
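
A two-level TLB lookup of the kind described can be sketched as follows. The 256-entry, four-way second level comes from the excerpt, which gives 64 sets. The first-level capacity of 32 entries, the FIFO replacement, and the `TwoLevelTLB` class itself are assumptions made for illustration; only 4K pages are modeled.

```python
from typing import Dict, List, Optional, Tuple

PAGE_SHIFT = 12                    # 4K pages only in this sketch
L2_ENTRIES = 256                   # from the excerpt
L2_WAYS = 4                        # from the excerpt
L2_SETS = L2_ENTRIES // L2_WAYS    # 64 sets


class TwoLevelTLB:
    def __init__(self, l1_capacity: int = 32) -> None:
        # l1_capacity is an assumption; the excerpt does not give a readable figure.
        self.l1_capacity = l1_capacity
        self.l1: Dict[int, int] = {}   # fully associative: vpn -> pfn
        self.l2: List[Dict[int, int]] = [{} for _ in range(L2_SETS)]

    def insert(self, vpn: int, pfn: int) -> None:
        """Install a translation in the second level (FIFO replacement, assumed)."""
        ways = self.l2[vpn % L2_SETS]
        if vpn not in ways and len(ways) >= L2_WAYS:
            del ways[next(iter(ways))]  # dicts keep insertion order
        ways[vpn] = pfn

    def _fill_l1(self, vpn: int, pfn: int) -> None:
        if vpn not in self.l1 and len(self.l1) >= self.l1_capacity:
            del self.l1[next(iter(self.l1))]
        self.l1[vpn] = pfn

    def translate(self, vaddr: int) -> Tuple[str, Optional[int]]:
        """Return (level that hit, physical address), or ("miss", None)."""
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.l1:
            return "l1", (self.l1[vpn] << PAGE_SHIFT) | offset
        ways = self.l2[vpn % L2_SETS]
        if vpn in ways:
            self._fill_l1(vpn, ways[vpn])  # promote to the first level
            return "l2", (ways[vpn] << PAGE_SHIFT) | offset
        return "miss", None
```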
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... as the primary working memory for most PCs. 

FPM RAM Fast Page Mode RAM is an older standard for primary working
memory. FPM RAM cannot simultaneously seek and retrieve memory
[illegible], and it has been largely replaced by extended data output RAM.

L1 Cache A very small, very fast static RAM cache, Level-1
Cache is located within the processor chip itself. For example, the Intel
[illegible] CPU has 16Kbytes of L1 Cache.

L2 Cache An external static RAM-based cache, Level-2 Cache
helps feed memory contents to the main processor, which it does faster
than DRAM, by preloading expected addresses. Typical L2 Cache sizes are
between 256KB and 1MB. L2 Caches typically have either a pipeline-burst
design or a faster, flow-through design.

Cache RAM Cache is a small pool of high-speed memory that stores
data likely to be requested next by the processor. See L1 Cache, L2
Cache.

SRAM Unlike DRAM, Static RAM "remembers" bits without having to be
constantly refreshed. It is faster than DRAM, but more expensive. It is
typically used for L2 caches.

SDRAM Synchronized Dynamic RAM is an emerging replacement for DRAM.
SDRAM's memory access cycles are synchronized with the main processor's
clock to eliminate the processor's wait time between memory fetches.
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Pentium desktops just got faster. (evaluation of four Pentium-based 
PCs) (includes related article on Universal Serial Bus design, and 
benchmarks used in testing) (Hardware Review) (Evaluation) 

Roger 

PC User, n286, p28(4)



June 26, 1996 
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... 430HX (formerly the Triton II). The 430VX supports synchronous
dynamic RAM (SDRAM) and Concurrent PCI technology, which enables PCI and
ISA buses to execute transactions simultaneously with no lag time.

The 430HX supports Concurrent PCI, error checking and correcting
(ECC) memory, dual processing and a shared second level cache that
maximises EDO RAM. It also supports Unified Memory Architecture, which
eliminates the need for dedicated display memory (PC User, 17 April 1996).

The 430VX...
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200MHz Pentium PCs on the horizon. (Gateway 2000's P5-200 PC, IBM's PC 300,
Dell Computer's Dimension, HP's Vectra VL and NEC's PowerMate) (Product
Announcement)

DiCarlo, Lisa

PC Week, v13, n17, p1(2)

April 29, 1996 

DOCUMENT TYPE: Product Announcement ISSN: 0740-1604 LANGUAGE: 
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... Unlike the others, NEC will incorporate Intel's 430HX chip set and
a second Intel motherboard, code-named Cumberland, according to sources.

The 430HX supports Concurrent PCI, error-checking-and-correcting
memory, dual processing and a shared Level 2 cache that maximizes
Extended Data Out RAM. Advanced PCs will be equipped with pipeline burst
cache.

Officials from Gateway 2000, Dell, IBM, HP, NEC and Intel declined to
comment on unannounced products.

Speed Thrills: 200MHz desktop PCs due in June
Vendor...
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Hal reveals multichip SPARC processor. (Hal Computer Systems' Sparc64) 
(includes related article on price and availability) 
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... speeds branch handling. 

The fetch unit takes up to four instructions per cycle from the
prefetch buffer. If the needed data is not in the prefetch buffer (due to
a misprediction), the fetch unit can instead read the data from the L1
cache with no delay. The L1 cache is nonblocking and capable of
providing data at the same time that it is accepting data from the
prefetch buffer. If an access misses the L1 cache as well, the needed
instructions must be fetched from the L2 cache, requiring a delay of
three cycles. The deep queues in the execution engine typically mask this
delay.

As long as instructions are available, the fetch unit will read four
arbitrarily aligned instructions and pass them to the issue unit. The only
exception occurs if the end of a cache line is reached; because the fetch
unit cannot access two cache lines at once, it will finish one line, then
begin the next line on...
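
The cache-line boundary rule above can be illustrated with a short calculation. This is a sketch only: the 64-byte line size is an assumption (the excerpt does not give it), the function name is hypothetical, and the program counter is assumed to be aligned to the 4-byte SPARC instruction width.

```python
INSTRUCTION_BYTES = 4   # SPARC instructions are 32 bits wide
MAX_FETCH = 4           # up to four instructions per cycle, from the excerpt


def instructions_fetched(pc: int, line_bytes: int = 64) -> int:
    """Instructions deliverable in one cycle when a fetch cannot cross a cache line.

    line_bytes is an assumed line size; pc must be 4-byte aligned.
    """
    bytes_left_in_line = line_bytes - (pc % line_bytes)
    return min(MAX_FETCH, bytes_left_in_line // INSTRUCTION_BYTES)
```

With these assumptions, a fetch starting 8 bytes before the end of a line delivers two instructions, and the remaining ones come from the next line in a later cycle.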
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Storage Technology announcements: Storage Technology makes its case for 
Iceberg being the disk technology for the 1990s. (Product Announcement) 

Computergram International; n1848, pCGI01310007

Jan 31, 1992 
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... to data transfers. The controller has eight Am29000 control
processors which simultaneously process channel programs on eight separate
channels. On the back end of the cache, XSA supports 16 unidirectional
data paths to the disk arrays, eight in each direction. Two are dynamically
assigned for status and control, and the other 14 are used for transferring
compressed data. This parallel design enables XSA to stage and de-stage
data between the cache and any 14 devices simultaneously and
independently of channel activity. Each device has an actuator-level
buffer used for reading and writing.

Extended Performance Capabilities

XSA uses caching techniques, non-volatile storage and
actuator-level buffers to eliminate synchronous data transfers to and from
physical devices. It is optimised for writing to disk...
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Bits & PCs: your guide to all the latest and best PC products. 
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... DAS 2 is a new anti-hacking device from Kerridge Network Systems. 

It sits between the host computer and the modem and provides up to two 
secure ports. 

Features include four- level password security and call monitoring 
for both synchronous and asynchronous services. 

Units can be added in a cascading system to provide up to 64 ports. 
BT charge rates can be set to monitor calls. 1,395 pounds Fast facts: 297 

Compatibles
PC340 * Ti'Ko (0506) 857666

Ti'Ko's latest 40MHz 386-based computer has a 128Kb cache and
offers users a 43Mb hard disk, 3 1/2-inch and 5 1/4-inch floppy disk drives 
and a SuperVGA monitor. Other hard... 
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32-bit microprocessors. (1987 technology forecast) 

Bursky, Dave 
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... giving way to Harvard-type structures, which do more things in
parallel, thus improving system efficiency. Multiple 32-bit address and
data buses, on-chip caches for instructions and data, embedded
memory-management units, and translation look-aside buffers operate in
parallel with the CPU, thanks to large instruction queues and multiple
pipeline levels. The higher degree of parallelism cuts the execution time
per instruction, raising throughput.

Larger on-chip register files -- from hundreds to thousands of bytes
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... offers SCSI Fast-Wide/Ultra-Wide and/or Fibre Channel connections. 

Other features include pro-OS (process-optimized Operating System),
fifth-generation storage management software, read-ahead/write-behind
data caching (speeds I/O), partitionable data storage space, and
multiple/concurrent RAID levels 0, 1, and 5.

Alpharetta, Georgia-based Raidtec offers RAIDserver, another NAS-type
[illegible] line. The company claims that the server offers improved
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... Unlike some other systems on the market, Lucent's CPCI Speech 

Processing Board comes with the right to use its speech software without 
having to obtain individual licenses for each set of channels. 

The 6U high cPCI board is hot-swappable. The 32-bit bus handles a 133
MBPS data bandwidth. Three RISC processors with 1 MB Level 2 caches
are used (instead of DSPs) that sit on 64-bit memory buses. The board holds
192 MB of SDRAM and supports 128 simultaneous channels. You'll find
available onboard: 64 simultaneous channels of echo cancellation, 64
simultaneous channels of text-to-speech, 64 simultaneous channels
of compressed speech (16 bit linear LCCELP, ADPCM, mu-law or a...
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... to its tag bits, process identifier bits, allowing lines to be
invalidated by process to avoid a complete flush on a context switch. The
data cache operates in copy-back mode with a write buffer, keeping the
level of bus activity down. Both caches use streaming loads, so that a
request that triggers a cache miss is met directly from the L2 cache,
without waiting for the L1 line to fill.
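
The idea of storing process identifier bits alongside the tag can be sketched as below. The `PidTaggedCache` class is hypothetical and direct-mapped for brevity; it also assumes that a hit requires the process identifier to match, which the excerpt does not state explicitly.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Line:
    tag: int
    pid: int
    valid: bool = True


class PidTaggedCache:
    """Toy cache whose lines carry a process identifier next to the tag."""

    def __init__(self) -> None:
        self.lines: Dict[int, Line] = {}   # index -> line

    def fill(self, index: int, tag: int, pid: int) -> None:
        self.lines[index] = Line(tag, pid)

    def hit(self, index: int, tag: int, pid: int) -> bool:
        # Assumption: both the tag and the process identifier must match.
        line = self.lines.get(index)
        return line is not None and line.valid and line.tag == tag and line.pid == pid

    def invalidate_process(self, pid: int) -> int:
        """Invalidate only one process's lines instead of flushing the whole cache."""
        count = 0
        for line in self.lines.values():
            if line.valid and line.pid == pid:
                line.valid = False
                count += 1
        return count
```

Lines belonging to other processes stay valid across the call, which is the saving over a complete flush on a context switch.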

The L2 cache also shows some interesting design ideas. It is
direct-mapped, but write-through, rather than write-back. The on-chip L2
controller manages the cache as a look-aside unit, sharing a common
data bus with main memory. Thus, on an L1 miss, the chip automatically
starts both a DRAM access and an L2 access, and aborts the DRAM cycle if
the L2 cache has the data. Using Fujitsu 32k-by-36 synchronous SRAMs,
the cache line is four 64-bit words, contained in four 72-bit cache
entries. The additional bits include two parity bits and the tag
information. The L2 controller thus picks up the data and tag information
in the...
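
The benefit of the look-aside arrangement can be shown with a simple latency model. This is a sketch with hypothetical cycle counts: the function names are illustrative, and the look-through variant is included only as a point of comparison, not as something the excerpt describes.

```python
def look_aside_latency(l2_hit: bool, l2_cycles: int, dram_cycles: int) -> int:
    """L2 and DRAM accesses start together; the DRAM cycle is aborted on an L2 hit."""
    return l2_cycles if l2_hit else max(l2_cycles, dram_cycles)


def look_through_latency(l2_hit: bool, l2_cycles: int, dram_cycles: int) -> int:
    """Comparison case: the DRAM access starts only after the L2 miss is known."""
    return l2_cycles if l2_hit else l2_cycles + dram_cycles


def extra_bits_per_entry(entry_bits: int = 72, word_bits: int = 64) -> int:
    """Bits left in each cache entry for parity and tag information."""
    return entry_bits - word_bits
```

With an assumed 3-cycle L2 and 20-cycle DRAM, a miss costs 20 cycles in the look-aside model against 23 in the serial one. The 72-bit entries quoted in the excerpt leave 8 bits per 64-bit word for parity and tag.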
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... Unlike the others, NEC will incorporate Intel's 430HX chip set and
a second Intel motherboard, code-named Cumberland, according to sources.

The 430HX supports Concurrent PCI, error-checking-and-correcting
memory, dual processing and a shared Level 2 cache that maximizes
Extended Data Out RAM. Advanced PCs will be equipped with pipeline burst
cache.

Officials from Gateway 2000, Dell, IBM, HP, NEC and Intel declined to
comment on unannounced products.

Speed Thrills: 200MHz desktop PCs due in June

Vendor...
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... The access time of the R4000 secondary cache is programmable as an
integral number of processor clock cycles to support SRAMs of varying
speeds. Larger caches can be built with slower and less expensive SRAMs,
and the processor clock speed can be increased without affecting the
secondary-cache design. The designer can therefore target various
price/performance levels.
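
Choosing an integral cycle count for a given SRAM can be sketched as a ceiling division. This is a simplification for illustration: the function name and the example speeds are assumptions, and the model ignores controller overhead and any distinction between read and write timing.

```python
import math


def secondary_cache_cycles(sram_access_ns: float, clock_mhz: float) -> int:
    """Smallest whole number of processor cycles that covers the SRAM access time."""
    period_ns = 1000.0 / clock_mhz
    return math.ceil(sram_access_ns / period_ns)
```

For example, under this model a 25 ns SRAM needs 2 cycles at 50 MHz and 3 cycles at 100 MHz, so the same SRAM stays usable as the clock rises, at the cost of more cycles per access.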
O/S considerations 

While virtually indexed, physically tagged caches allow parallel
cache interrogation and translation look-aside buffer (TLB) lookup,
physical caches simplify sharing of data with different virtual
addresses. The R4000 incorporates virtually indexed first-level cache
and physically tagged and indexed second-level cache to eliminate
"aliasing problems," which add complexity to the operating-system software.

For maintaining coherency among private caches in a bus-based
multiprocessing system, snooping must be carried out in the secondary
cache. The proper subset property must be maintained between the first and
second cache levels. The R4000 maintains that subset property and cache
coherency by keeping the cache state information (dirty, shared, etc.) in
both the first- and second...
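
The aliasing problem mentioned above arises when the index bits of a virtually indexed cache extend beyond the page offset. The sketch below counts how many different cache locations one physical line could occupy; the function name and the example sizes in the note are illustrative assumptions, not figures from the excerpt.

```python
def alias_colors(cache_bytes: int, ways: int, page_bytes: int) -> int:
    """Number of distinct locations ("colors") one physical line can map to
    in a virtually indexed cache. A result of 1 means aliasing cannot occur,
    because the whole index lies inside the page offset.
    """
    way_bytes = cache_bytes // ways
    return max(1, way_bytes // page_bytes)
```

As an illustration, an 8KB direct-mapped cache with 4KB pages gives 2 colors, so two virtual addresses for the same physical data can land in different lines. That is the case the operating system or a physically indexed second level has to resolve.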
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... a highly graphical user interface. The interface is flexible; 

designers can move windows or resize them at will. With the floating-point 
capability, designers can look at register or memory contents in 
floating-point formats.

Other standard features include support for cache bursts and
synchronous cycles, both source- and symbolic-level debugging, and
zero-wait-state high-speed RAM overlay.

Pricing for the development system starts at $30,000. 

Jay Maggard, (206) 882-2000 

Reader Service . . . 
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... and double-extended data types. The main datapath consists of a 

10-stage core pipeline that can execute up to six IA-64 instructions in 
parallel during each clock cycle. A large on-chip register file provides 
programmers with 128 integer, 128 floating-point, 64 one-bit predicate, and 
eight branch registers (Fig. 1). Level-1 and level-2 caches are
integrated on the chip. A level-3 cache interface can address up to 4
Mbytes over a dedicated back-side bus that employs a source-synchronous
interface. Both level-2 and -3 caches include error checking and
correction to ensure data integrity.

The processor was designed to handle multiple operations in parallel, 
so it includes these 11 execution units: four integer... 
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... second half.

By the end of 1998, AMD will offer the K6-3 product line with a
superscalar instruction set and nearly double the K6-2's Level 1
cache. It will be positioned against Intel's Katmai processor. AMD
officials added that the K-7 would be ready at about the same time
Intel's Merced hits the market, but that was before Intel's May 29
announcement of delays in the Merced schedule.
Herb lectured about 100...
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... information from the Level 2 cache.

If the external cache doesn't contain the required data, it makes
little difference what kind the Level 2 cache is; the motherboard chip
set must do a slow and arduous fetch from main memory. But if the external
cache does contain the needed data, it must send it to the CPU in a rapid
burst of four or more chunks. (This operation is often called a burst
fill because it fills the Level 1 cache from the Level 2 cache.)
This is when external cache architecture makes a difference.

A synchronous external cache can always furnish data as fast as
the CPU asks for it. This is the fastest kind, but it requires superfast
RAM and is therefore very expensive. Second best is a pipelined burst
cache. This type of cache takes a bit longer than a synchronous cache
to retrieve the first chunk of data but prepares the next while the
previous one is being sent so that no time is lost after...
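
The difference between the two cache types can be made concrete by summing the cycles spent on each chunk of a burst fill. The per-chunk timings below are illustrative assumptions, not figures from the article; they only encode its claim that a pipelined burst cache is slower on the first chunk and loses no time afterwards.

```python
from typing import Tuple

# Cycles per chunk of a four-chunk burst. Illustrative assumptions only.
SYNCHRONOUS = (2, 1, 1, 1)
PIPELINED_BURST = (3, 1, 1, 1)


def burst_fill_cycles(timing: Tuple[int, ...]) -> int:
    """Total cycles to deliver every chunk of one burst fill."""
    return sum(timing)


def extra_cycles(slower: Tuple[int, ...], faster: Tuple[int, ...]) -> int:
    """Additional cycles the slower cache needs for the same burst."""
    return burst_fill_cycles(slower) - burst_fill_cycles(faster)
```

With these assumed timings the pipelined burst cache costs one extra cycle per four-chunk fill, all of it on the first chunk.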
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16-word data block can be copied to the write-back buffer. 

The architecture defines the unit of coherency as a 64-byte (16-word)
line, or cache block. To ensure cache coherency, the data cache supports
the four-state MESI (modified/exclusive/shared/invalid) protocol.
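
The four MESI states can be illustrated with a textbook transition function for a single cache line. This is a generic sketch of the protocol, not the 620's implementation, whose details are not given in the excerpt; the event names are hypothetical.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state: State, event: str, others_have_copy: bool = False) -> State:
    """Textbook MESI transition for one cache line.

    event is one of "local_read", "local_write", "snoop_read", "snoop_write".
    others_have_copy only matters for a local read of an INVALID line.
    """
    if event == "local_write":
        return State.MODIFIED        # gain ownership; other copies are invalidated
    if event == "snoop_write":
        return State.INVALID         # another cache takes ownership
    if event == "local_read":
        if state is State.INVALID:
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state                 # a read hit leaves the state unchanged
    if event == "snoop_read":
        if state is State.INVALID:
            return State.INVALID
        return State.SHARED          # M supplies or writes back its data first
    raise ValueError(f"unknown event: {event}")
```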

Another feature which separates the 620 from its predecessors is the
use of an on-chip L2 cache controller which supports configurations
from 1Mbyte to 128Mbyte using the same 64-byte block size as the internal
L1 caches. The L2 cache is a direct-mapped, error-correction-code
protected, unified instruction and secondary data cache. It supports the
use of single and double register synchronous SRAMs, and the L2
interface supports a number of external SRAM access speeds as well as an
external coprocessor, and the direct-mapped error correction code.

It is a superscalar microprocessor which fetches, dispatches and
completes up to four instructions per cycle. In order to keep its multiple
pipelines flowing, the 620 uses the prediction of branch instructions...
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... Nx586 processor delivers higher throughputs than the Intel Pentium
when running at the same clock speeds. The secret is a RISC core combined
with a second-level cache controller, 16-kbyte data and instruction
caches (four-way set associative), and a proprietary coprocessor interface
to an off-chip floating-point unit. Its fetch stack allows four
simultaneous level-2 to level-1 cache replacements. In addition,
a write-reservation queue supports out-of-order instruction execution and
in-order retirement, which helps maximize software throughput.
By using a...
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... the processor only on a snoop hit. On a snoop miss, it is
guaranteed that the cycle does not affect data in the first-level cache,
so it is unnecessary to propagate the cycle. This shields the processor
from the heavy bus traffic typical of multiprocessor systems.
Guaranteed inclusion is implemented...

...in the second-level cache, or when both the first- and second-level
caches are being loaded simultaneously from system memory. This ensures
that the first-level cache can only obtain data present in the
second-level cache. Invalidation cycles are run on the first-level cache
whenever data is invalidated or replaced in the second-level cache.
This ensures that data will not persist in the first-level cache after
it has ceased to exist in the second-level cache. It also couples the
first-level cache to the MESI coherency protocol implemented among the
second-level caches, which ensures system-wide cache coherency.
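
The inclusion rule and the snoop filtering it enables can be sketched with two sets of line addresses. The `InclusiveHierarchy` class and its method names are hypothetical, and capacity limits and MESI states are left out to keep the illustration short.

```python
from typing import Set


class InclusiveHierarchy:
    """Toy model in which the first-level contents are always a subset of the second."""

    def __init__(self) -> None:
        self.l1: Set[int] = set()
        self.l2: Set[int] = set()

    def load(self, line: int) -> None:
        """Load a line from memory into both levels at once."""
        self.l2.add(line)
        self.l1.add(line)

    def evict_l1(self, line: int) -> None:
        self.l1.discard(line)   # the line may remain in the second level

    def evict_l2(self, line: int) -> None:
        """Replace or invalidate a second-level line and back-invalidate the first level."""
        self.l2.discard(line)
        self.l1.discard(line)

    def snoop_reaches_processor(self, line: int) -> bool:
        """A snoop is propagated to the processor only on a second-level hit."""
        return line in self.l2

    def inclusion_holds(self) -> bool:
        return self.l1 <= self.l2
```

Because a second-level miss guarantees a first-level miss, the filter never hides a snoop the processor needs to see; it may still forward one for a line that has already left the first level.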
Conclusion 

The S3 chip set is an elegant solution for Intel-based multiprocessor 
systems. The S3 design provides the necessary features in a form... 



